ABSTRACT



A system and method of filtering junk e-mails. A user is
provided with or compiles a list of e-mail addresses or
character strings which the user would not wish to receive to
produce a first filter. A second filter is provided including
names and character strings which the user wishes to
receive. Any e-mail matching addresses or strings contained in
the first filter will be automatically eliminated from the
user's system. Any e-mail matching addresses or strings
contained in the second filter would be automatically sent to
the user's "in box". Any e-mail not matched by either of the
filter lists will be sent to a "waiting room" for user review.
If this user review results in the user rejecting any e-mail,
the address as well as specific character strings included in
this e-mail would be transmitted to a central location to be
included in a master list. This master list is periodically
sent to each of the users, allowing the first filter to be
updated. A collaborative filter is used employing message
based filtering that is not affected by e-mail header forgery
and utilizes the networked intelligence of end users to
maintain a highly accurate and comprehensive filter. The
collaborative filter would then use the real-time input from
the end users to keep the users involved in the filtering
process.

2 Claims, 11 Drawing Sheets 
E-MAIL FILTER AND METHOD THEREOF

CORRESPONDING APPLICATIONS 

This application is a continuation-in-part application of
Ser. No. 08/995,860, filed Dec. 22, 1997, now U.S. Pat. No.
6,023,723, and claims the benefit of provisional application
No. 60/091,935 filed Jul. 7, 1998.

FIELD OF THE INVENTION 

The present invention relates to an electronic or e-mail
filter system as well as a method of filtering unwanted e-mail
messages.

BACKGROUND OF THE INVENTION 

No cogent argument can legitimately be made refuting the
fact that technology, while generally benefitting mankind,
does have its occasional deficiencies. This is certainly true
with respect to the communications industry. Unfortunately,
each technological advancement relating to the ease and
facility of providing communications between various
individuals or companies has created minor headaches or
problems. Although used sporadically since the early 1920's,
the utilization of the airplane in the mail industry since the
end of World War II allowed individuals and communities on
both the east and west coasts to be linked with one another.
Mail sent from New York to Los Angeles would be received
within two or three days from the date that the communication
was originally posted. Although the use of airmail had a
salutary effect upon the communication between individuals
and other entities, various companies seized upon this
relatively inexpensive means of communication to inundate
the public with a large number of junk mail solicitations.
Unfortunately, to the chagrin of many of these junk mail
operators, the public could generally determine which mail
was important and which was not, based upon a number of
factors such as the type of envelopes which were utilized,
the return address of the sender as well as the manner in
which the sendee was addressed. Therefore, many of these
solicitations were never opened and were merely discarded.

The deregulation of the telecommunications industry as 
well as the increased usage of "800" type numbers has 
resulted in an increased number of unwanted telephone 
solicitations. While tending to be an annoyance, once the 
called party determines that they are not interested in any 
solicitations or the type of solicitations offered by the caller, 
the called party can merely hang up his or her receiver. 

Increased use of facsimile machines in both the work
environment as well as personal facsimile machines at home
created another avenue for unwanted solicitations. Since
technology allowed a single letter of solicitation to be
transmitted to a large number of facsimile machines with
ease, it is easy to see that facsimile machine solicitations
became an annoying problem, particularly when the
individual's machine was receiving a large number of
correspondences utilizing the receiver's own paper.
Furthermore, these solicitations were tying up the user's
telephone line so that important messages were delayed or
never received. Due to an outcry by the public, legislation
was passed to forbid these types of unsolicited communications
directed to facsimile machines.

The explosion in the personal computer "PC" industry has
provided solicitors with yet another manner of sending
unsolicited messages. More and more businesses as well as
individual users are connected to one another over the
Internet and Intranet 11. Similar to the situation with respect
to facsimile machines, a solicitor can compose a message
and send it on the Internet and Intranet 11 to a relatively
large number of personal computers. Although these e-mail
messages are not necessarily reproduced on paper in the
manner that the facsimile messages were previously
received, the receipt of these messages would prevent other
legitimate messages from being received in a timely manner.
Therefore, it is clear that a system and method of filtering
unwanted e-mail messages must be developed to shield the
PC user from the annoyance of unsolicited junk e-mail.

U.S. Pat. No. 5,619,648 issued to Canale et al. is directed
to a technique for reducing the amount of junk e-mail
received by a user in an e-mail system. As illustrated with
respect to FIG. 1 of the Canale et al. patent, a user 105 who
wishes to reduce the amount of junk e-mail which is
received would be provided with a mail filter 109. A mail
item 119 in the system would include a standard e-mail
message as well as a recipient specifier 121 which uses
non-address information to further describe the recipients
who would receive the e-mail as well as a referral list 127
which is a list of potential recipients who pass the e-mail on
and of recipients to whom the e-mail was provided. The
recipient specifier 129 also includes a recipient description
125. If the recipient description specifies a recipient which
is of the same kind as that specified by the user model 113,
the mail filter 109 adds the mail item 119 to filtered mail 115.
The mail filter 109 can utilize the information in the referral
list 127 to indicate a chain of referrals which resulted in the
message being directed to the user 105. While this system
can be utilized to reduce a user's junk e-mail, it does not
necessarily include a filter technique in which mail sent by
a sender included in an approved guest list filter would be
designated as such when received by the user. Additionally,
this system is not utilized in a manner allowing an updated
master list of junk e-mail addresses or senders to be
developed and transmitted to other users in the system.

U.S. Pat. No. 5,093,918 issued to Heyen et al.; U.S. Pat.
No. 5,283,856 issued to Gross et al.; U.S. Pat. No. 5,377,354
issued to Scannell et al.; U.S. Pat. No. 5,632,011 issued to
Landfield et al. and U.S. Pat. No. 5,634,005 issued to Matsuo
are all directed to various systems for sorting and managing
electronic mail or similar messages. However, similar to the
Canale et al. patent, these patents do not describe a method
or system in which e-mail can be effectively filtered by the
user while also compiling an updatable master list of
unwanted e-mail transmitters which is then transmitted to
the end user for filter purposes.

SUMMARY OF THE INVENTION 

The present invention overcomes the problems of the
prior art by utilizing a method and system for filtering
unwanted junk e-mail sent to the user's computer. The user
would include various addresses or other defining
characteristics in a "No Admittance List" as well as a
plurality of addresses in a "Guest List". An incoming e-mail
whose address is included in the "No Admittance List" would
be immediately discarded. Any e-mail whose address is in the
"Guest List" would be immediately forwarded to an "In Box".
Any e-mail whose address is not included in either the "No
Admittance List" or the "Guest List" would be forwarded to a
"Waiting Room". The user would periodically review the e-mail
included in the "Waiting Room". Based upon this review, the
user would either discard the e-mail to a "Trash Bin" or
would send the e-mail to the "In Box". The addresses of
e-mails which were discarded after the user's review could be
automatically added to the "No Admittance List". Additionally,
the address of any e-mail added to the "In Box" after the
user's review
could also be automatically added to the "Guest List".
Addresses of e-mail which are discarded would be periodically
sent to a filter server, thereby adding the addresses to a
master list. This master list is then periodically transmitted
to all of the users in the system through a download server.

The following glossary of terms defines various comments
described in this application.

A Mail Server is any service that handles the Simple Mail
Transfer Protocol (SMTP). Mail Servers are also known as
Message Transport Systems (MTS). Examples of Mail Servers
are Sendmail, Microsoft's Exchange, etc.

Mail Storage refers to any type of system for storing
electronic mail (usually stored per user in mailboxes). Mail
Storage can consist of file storage, a database, etc.

A Mail Drop Service is any service that allows users to
directly retrieve messages from their mailboxes. Users'
e-mail clients usually directly interact with a Mail Drop
Service via some protocol. Examples of protocols used by
Mail Drop Services are the Post Office Protocol (POP) and
the Internet Message Access Protocol (IMAP).

A Mail Reader is any application that can send and
retrieve e-mail via a Mail Drop Service. Mail Readers are
also known as User Agents (UA). Examples of Mail Readers
are Qualcomm's Eudora, Microsoft Outlook, Netscape
Communicator, Elm, Mh, etc.

Spam is any unwanted e-mail, also known as unsolicited
commercial e-mail (UCE) or junk e-mail.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other attributes of the present invention will
become more apparent in light of the following detailed
description of an illustrative embodiment thereof, as
illustrated in the accompanying drawings of which:

FIG. 1 is a process flow chart and block diagram
illustrating the present invention;

FIG. 2 is a typical example of a graphic user interface
according to the present invention;

FIG. 3 is a block diagram showing various components of
the present invention;

FIG. 4 is a typical control screen illustrating a new search
on a member database;

FIG. 5 is a control screen illustrating the search results of
a member database;

FIG. 6 is a control screen illustrating a new search on an
address database;

FIG. 7 is a control screen illustrating the search results of
an address database;

FIG. 8 is a block diagram illustrating the collaborative
filter of the present invention;

FIG. 9 is a block diagram illustrating how the collaborative
filter is updated;

FIG. 10 is a block diagram illustrating server side e-mail
filtering; and

FIG. 11 is a block diagram illustrating the collaborative
filter replication design.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Turning to FIG. 1, the entire system of the present
invention 10 broadly includes a section 48 associated with a
user's personal computer and a section 46 provided at a
location remote from the personal computer and connected
therewith by a standard wired or wireless telecommunications
link or any other communications methodology. Each of the
users who are part of the system according to the present
invention would be provided with appropriate software
allowing each of the users to prepare individualized dual
filters to automatically prevent certain unwarranted e-mail
from being received as well as to receive desired e-mail. The
software can be installed directly by the user or would be
[illegible] popular [illegible] programs, and once installed
be transparent to the user. One of these filters is
automatically updated by other users in the system when known
unwanted e-mail addresses are determined. This software would
also allow the individual to use a customized graphic user
interface to assemble the filters. A typical graphic user
interface will be discussed in more detail hereinafter.
However, it is noted that the exact nature of the graphic user
interface can vary depending upon its [illegible] and
implementation.

The software would allow the individual user to construct
an automatic discard filter 12. The automatic discard filter
is a collective term consisting of a user modified discard
filter, a user personal address filter as well as a user
personal string filter. During operation of the system, the
automatic discard filter 12 would include a current filter
list comprising a list of active e-mail addresses against
which incoming e-mails are compared. This current filter list
is retained in a memory section of the user's computer. Any
comparison between any incoming e-mail and the current filter
list could be accomplished within the user's computer system.
The current filter list is maintained at the remote central
location 46 as well as being periodically updated in each of
the users' PC systems. The remote location 46 would include a
delta filter server 22 and download server 24 for a particular
user as well as delta server filter 26 from all other users.
The current filter list can be modified by the user to
personally remove any addresses therefrom through various
deletion techniques, thereby providing the user with a user
modified discard filter.

The user personal address filter would include additional
addresses the user has added to the current filter list as
well as character strings that the user has added via a text
entry containing an "@". For the purpose of the present
invention, a text entry is a character string entered into the
system by keyboard typing. Typing is initiated by double
clicking or highlighting and typing, thereby clearing an old
string and creating a new string. When the mouse is clicked
on some other location or "enter" is hit, the string will be
entered into the appropriate memory structure for the new
field.

The user personal string filter is defined as any character
string that the user has added to the automatic discard filter
to create a "No Admittance List" via text entry that does not
contain the "@". The term "No Admittance List" would be
a list of terms and addresses included to create the automatic
discard filter. The "No Admittance List" 52 is included in the
graphic user interface 50 in FIG. 2.

The Guest List Filter 14 includes addresses the user has
personally added to the system, for example by dragging an
e-mail to the "Guest List" 54 shown in the graphic user
interface of FIG. 2, or by any other means. The Guest List
Filter 14 also includes any character strings the user has
added via a text entry containing an "@".

Any e-mail received by the user is checked against the
automatic discard filter 12 to determine whether any
character string on the "No Admittance List" 52 will bar entry
of the e-mail, that is, whether the e-mail has matching text
in its address, subject line or message body. If this occurs,
that e-mail will be eliminated from the user's system as
indicated by the Trash Bin 16. Conversely, any incoming e-mail
whose address matches an address contained in the Guest List
would be automatically forwarded to the In Box folder 18 for
review by the user. Similar to the situation with respect to
the No Admittance List 52, a text string entered in the Guest
List 54 would forward all messages containing that character
string to the "In Box" folder 18. This feature would allow
users to receive on-demand direct marketing information from
parties promoting products for which the user has expressed
interest based upon the text string entered in the Guest List
54.

Incoming e-mail which is neither filtered by the automatic
discard filter 12 based upon the No Admittance List 52 nor
included in the Guest List filter 14 as embodied in the Guest
List 54 would be automatically sent to a Waiting Room
folder 20 to be individually reviewed by the user.

Unknown e-mail stopped by the automatic discard filter
12 based upon the inclusion of an unwarranted character
string or based upon a personal review by the user would be
used both to automatically update the addresses included in
the automatic discard filter and to alert other users in
the system of the existence of objectionable e-mail
addresses. These new addresses are periodically and
automatically transmitted to an address filter server 22
provided at the remote central location 46. Based upon
numerical and temporal factors as described hereinafter, these
addresses are included on the current filter list associated
with the address filter server 22, stored in a filter database
associated with a database server 24 in communication with the
address server 22.

Periodically, the database server 24 in communication
with the address filter server 22 would download updated
filter addresses to the various users in the system by
constructing an address packet consisting of every address
added to the current filter list since the date and time of
each user's last update. The address packet is a data
structure consisting of N strings of e-mail addresses and a
variable containing the time of construction of the packet.
The packet is compressed for downloading and uploading
multiple e-mail addresses. Based upon the particular
implementation of the software of the present invention, the
updated version of the current filter list is substituted for
the No Admittance List currently provided in the user's
system. Alternatively, since the No Admittance List might
include addresses and character strings personally added by
the user but not included in the current filter list, the
updated filter list would be compared with the automatic
discard filter and any additional entries not included in the
automatic discard filter would be added thereto.

FIG. 2 illustrates the In Box folder 18 and the Waiting
Room 20 in more detail as well as giving examples of the
type of messages included therein. The list of names
included on the automatic discard filter 12 is provided
in the No Admittance List 52. Any incoming e-mail whose
address matches one of the addresses on this list is
immediately discarded to Trash 16. Addresses may be added
to this list via an update button 61, the Add to No Admittance
button 58, text entry, or by dragging a selected e-mail to this
window with the mouse. The update button 62 automatically
downloads the latest automatic discard filter from the
download server 24. The updated filter list is displayed in
the No Admittance Window. Simultaneously, user added e-mail
addresses are sent to the Delta Filter Server 22 for
consideration in future updates to the users in the system.

The Waiting Room folder 20 includes only those e-mails
that have successfully passed through the automatic
discard filter but are not included on the Guest List filter 54.
Additionally, any e-mail from any folder may be selected
and dragged into the Waiting Room 20 by the user using his
or her mouse. The Waiting Room display as shown in FIG.
2 would include e-mail addresses, the date and time of
receipt as well as the subject of the e-mail. The exact layout
of this Waiting Room can be changed depending upon the
user's requirements.

An Empty button 56 is associated with the Waiting Room
20. This button discards all e-mails in the Waiting Room
folder. If the folder is not empty, a pop up box will be
displayed with a warning ensuring that the user wishes the
Waiting Room to be emptied. If this is the case, a pop up
button would allow the user to proceed. If the user does not
wish to empty the e-mails in the Waiting Room, the initial
request would be canceled and the e-mails included therein
would not be cleared.

The Add to No Admittance button 58 associated with the
Waiting Room 20 would add the addresses of all selected
e-mails in the Waiting Room to the automatic discard filter.
The No Admittance List in the no admittance window will
scroll to reveal newly added addresses.

The Guest List window 54 would include a list of names
on the Guest List filter. Any incoming e-mail whose
address matches one of the addresses on this list is
immediately forwarded to the In Box folder 18. Addresses may
be added to this list via the Add to Guest List button 60, text
entry, or by dragging a selected e-mail to this window with
a mouse.

The In Box 18 includes only those e-mails that have
successfully passed through both the automatic discard filter
and the Guest List Filter. Additionally, any e-mail from any
folder may be selected and dragged into the In Box 18 by
the user using the mouse. Similar to the Waiting Room 20,
the In Box 18 includes the e-mail addresses, the date and
time of receipt as well as the subject matter of the e-mail.
Furthermore, the particular configuration of the In Box 18 as
illustrated in FIG. 2 can be changed depending upon the
user's requirements. By clicking on an open slot in the No
Admittance List 52 or the Guest List 54 or by double
clicking on an existing text, the user may enter a character
string to be checked in the filtering system. Any such
character string on the No Admittance List will bar the entry
of any e-mail with matching text in its address, subject line
or message body. For example, as shown in the No Admittance
List 52, any received e-mail with the words "free
money" in its subject or message would be discarded. A text
string similarly entered in the Guest List would forward all
messages containing that character string to the In Box. Text
entry can also be used to type in new e-mail addresses or edit
existing ones on either of the filter lists.

The use of the click and drag technology would allow a
graphic user interface to be used to transfer the address or
character string of an e-mail to either the No Admittance List
or the Guest List.

FIG. 3 illustrates a typical block diagram of the major
components of the present invention. The present invention
can be utilized by a home user 17 or by a corporate user 19
connected to the Internet 11. The provider of the present
invention is also connected to the Internet and Intranet 11,
allowing a web server 13 to advertise the present invention
through a home page 15. The provider connected to the
Internet and Intranet 11 at a second location allows the
automatic discard filters 12 of the users to be periodically
updated. A local area network (LAN), wide area network
(WAN) or any other type of network provided at the remote
location 46 allows the address filter server 22 to be in
communication with the database server 24 as well as a filter
database administration tool 25 allowing the provider to
maintain complete control over the state of the address filter 
and the filter database. FIG. 3 also illustrates the Internet and 
Intranet 11 connections between a typical home user and a 
corporate user. 

The address filter server 22 would relay e-mail filter user
requests to the filter database of the database server 24 by
making appropriate calls via RPC to a library on the
database server or by sending SQL commands to the
database directly. The address filter service will be
implemented via connection based (TCP) communication. The
sending of new addresses from an e-mail filter user to the
filter database could be handled by connectionless (UDP)
communication since failure to handle all new addresses is
not critical.
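
The address packet and the connectionless upload can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the description specifies N address strings, a construction time and compression, but not a wire format, so the JSON encoding, the use of zlib, and the names pack_address_packet, unpack_address_packet, report_new_addresses and merge_update are all hypothetical.

```python
import json
import socket
import time
import zlib


def pack_address_packet(addresses, built_at=None):
    """Serialize N address strings plus the construction time, then compress.

    The JSON layout is an assumption; the text only names the two fields.
    """
    packet = {
        "built_at": time.time() if built_at is None else built_at,
        "addresses": sorted(set(addresses)),
    }
    return zlib.compress(json.dumps(packet).encode("utf-8"))


def unpack_address_packet(blob):
    """Inverse of pack_address_packet: returns (built_at, list of addresses)."""
    packet = json.loads(zlib.decompress(blob).decode("utf-8"))
    return packet["built_at"], packet["addresses"]


def report_new_addresses(addresses, host, port):
    """Fire-and-forget upload over UDP; a lost datagram is simply tolerated."""
    blob = pack_address_packet(addresses)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(blob, (host, port))


def merge_update(no_admittance, blob):
    """Add downloaded entries while keeping the user's own entries."""
    _, addresses = unpack_address_packet(blob)
    return set(no_admittance) | set(addresses)
```

merge_update corresponds to the alternative in which the downloaded list is compared with the automatic discard filter and only missing entries are added, so that entries typed in by the user survive an update.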

FIG. 3 illustrates a situation in which the e-mail filtering 
can occur directly at the home user's PC 17 or a corporate 
user's PC 19 and communication is provided over the 
Internet and Intranet 11 to the remote location 46. However,
this invention can be practiced employing a centralized 
e-mail system database 21 connected to the home user's PC 
17 or the corporate user's PC 19 through the Internet and 
Intranet 11. In this situation the filtering is accomplished at 
the centralized e-mail system database which is the location 
of the e-mail directed to the home user or the corporate user. 

FIGS. 4-7 illustrate typical control screens used by the 
administrator of the present system at a remote central 
location. These screens are used for maintaining, searching 
and editing both the address database which consists of 
every address that has been sent to the central location 
address filter server 22 or added via a centralized control 
interface as well as the member database consisting of all 
members who have set up and updated their software. FIG. 
3 illustrates a "blank" address database screen and FIG. 4 
illustrates this database screen when information has been 
entered therein. 

Both the control screen for the member database 62 and 
the address database 70 contain a search field 64, a search 
panel 66 and a filter panel 68. The search field 64 would 
contain information matching an entry in either the address 
database or the member database. Buttons 72 and 74 would 
allow either of these databases to appear on the control 
screen. Both of these databases would include search results 
run in either the address database or the member database in 
Section 66. The current filter Section 68 would allow entries 
to be updated or saved at various times. It would also include 
a box 76 indicating the number of days an address can 
remain on the current filter list without a new instance of that 
address being uploaded by the filter users. It would also 
include a box 78 listing the minimum number of reportings 
required for an address to be placed on the current filter list. 
Certainly both of these control screens can be set up in a
multitude of ways depending upon the specific information
to be provided.
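
The two settings in boxes 76 and 78 suggest a simple selection rule for the current filter list. The Python sketch below is one possible reading, not the patented procedure: it assumes the server keeps the time of every report per address, and the function name current_filter_list and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def current_filter_list(reports, min_reports, max_age_days, now):
    """Return the addresses that qualify for the current filter list.

    reports: dict mapping an address to the datetimes at which users
             reported it (an assumed storage layout).
    min_reports: minimum number of reportings (box 78).
    max_age_days: days an address may stay listed without a new report (box 76).
    """
    cutoff = now - timedelta(days=max_age_days)
    selected = []
    for address, times in reports.items():
        enough = len(times) >= min_reports
        fresh = bool(times) and max(times) >= cutoff
        if enough and fresh:
            selected.append(address)
    return sorted(selected)
```

An address reported often but not recently drops off the list, and an address reported only once never reaches it, which matches the "numerical and temporal factors" mentioned earlier.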

Returning to FIG. 1, the process of comparing received
e-mails to both the Automatic Discard filter 12 and the Guest
List filter 14 will now be explained. Incoming e-mails 28, 30
and 32 are compared to information contained in the user
modified discard filter, the user personal address filter and
the user personal string filter utilizing the address line, the
subject line as well as the message body. Since the
information included in e-mails 28, 30 and 32 is not contained
in the automatic discard filter, all three of these e-mails are
directly transmitted to the Guest List filter 14. The e-mail
addresses, subject line and message body of these three
e-mails result in a match for all three of these e-mails.
Consequently, these e-mails are sent to the In Box folder 18.
E-mails 34 and 36 were sent from a known bulk e-mailer.
Since information in these e-mails is included in the
automatic discard filter 12, both of these e-mails are directly
sent to trash 16.

Four unknown e-mails 38, 40, 42 and 44 are initially sent
to the automatic discard filter 12. E-mail 38 is automatically
stopped by the automatic discard filter and sent to
trash 16. Although the address of this e-mail is not initially
included in the automatic discard filter 12, the subject line or
message body contains a character string included in the
automatic discard filter. The address of this e-mail is
automatically added to the discard filter 12. During the next
master filter update, this new junk e-mail address will be
forwarded to the Delta Filter Server 22. E-mail 40 passes
through the automatic discard filter 12 and is stopped by the
Guest List filter 14 and is then forwarded to the Waiting
Room folder 20. Upon review, the user decides to place this
address on the automatic discard filter. Future e-mails from
the same sender will be sent to trash. During the next master
filter update, this new e-mail address would be forwarded to
the Delta Filter Server 22. E-mail 42 passes through the
automatic discard filter 12 and is stopped by the Guest List
filter 14 and is then forwarded to the Waiting Room folder 20.
The user reviews this e-mail and decides to place its address
on the Guest List. Future e-mails from the same sender will be
sent to the In Box folder 18.

E-mail 44 passes through the automatic discard filter and 
is stopped by the Guest List filter 14 and sent to Waiting 
Room 20. Since the user took no action with respect to this 
e-mail, it would remain in the Waiting Room folder. 
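
The walk-through of e-mails 28 through 44 amounts to a three-way routing decision, which the following Python sketch illustrates. It is a minimal sketch under assumptions: matching is done by case-insensitive substring search over the address, subject and body, and the names Email, route and process are hypothetical rather than taken from the described software.

```python
from dataclasses import dataclass


@dataclass
class Email:
    address: str
    subject: str
    body: str


def route(email, no_admittance, guest_list):
    """Return the folder for an e-mail: discard filter first, then Guest List."""
    text = " ".join([email.address, email.subject, email.body]).lower()
    if any(entry.lower() in text for entry in no_admittance):
        return "Trash"
    if any(entry.lower() in text for entry in guest_list):
        return "In Box"
    return "Waiting Room"


def process(email, no_admittance, guest_list, pending_reports):
    """Route an e-mail; a sender caught only by a string match (as with
    e-mail 38) is added to the discard filter and queued for the next upload."""
    folder = route(email, no_admittance, guest_list)
    if folder == "Trash" and email.address not in no_admittance:
        no_admittance.add(email.address)
        pending_reports.append(email.address)
    return folder
```

Checking the No Admittance List before the Guest List follows the order of FIG. 1, so a Guest List entry cannot rescue a message that already matches a barred string.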

Collaborative Filtering Technology (CFT) is a filtering 
solution for stopping junk e-mail and involving end users in 
the war against spam. CFT works by leveraging the actual 
spamming experiences of end users to create a dynamically 
changing set of junk e-mail filter rules. These rule sets are 
then used to sort spam. The technology is simple, effective 
and empowers end users by involving them as active players 
in the spam wars. 

Collaborative Filtering Technology is the best e-mail
filtering solution for organizations with numerous e-mail
users such as Internet service providers (ISPs), free Internet
e-mail providers, and MIS departments of major
corporations. This technology integrates well across various
platforms and infrastructure architectures, providing an
extremely high level of end user protection with relatively
little administrative burden. Collaborative filtering
maximizes benefits while minimizing costs.

When an end user receives a piece of e-mail that he or she 
decides is junk mail, the user submits that message to the 
Collaborative Filter through a simple button click. The 
e-mail's body is analyzed and it is stored in the Collabora- 
tive Filter. When a small but statistically significant number 
of the same message have been submitted, the Collaborative 
Filter is updated to start filtering all such messages from the 
system. Unlike other e-mail filtering systems, the Collabo- 
rative Filter does not exclusively utilize source filtering. 
Source filtering uses an e-mail's header information to filter 
e-mail from given source addresses. Experience indicates 
that source filtering is inappropriate for completely filtering 
junk e-mail since e-mail headers are easily forged. Instead, 
the Collaborative Filter of the present invention uses
message filtering based on an e-mail's body. Because the
e-mail's body must contain a message (i.e., an advertisement)
and this message cannot be drastically altered, the body is
therefore the most appropriate data to be used for filtering.
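
A body-based collaborative filter with a submission threshold might look like the Python sketch below. The text does not say how message bodies are compared, so hashing a whitespace-normalized, lower-cased body with SHA-256 is an assumption made for illustration, as are the names fingerprint and CollaborativeFilter and the rule of counting each submitting user once.

```python
import hashlib
import re


def fingerprint(body):
    """Hash the body after collapsing whitespace and case (assumed normalization)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class CollaborativeFilter:
    def __init__(self, threshold):
        self.threshold = threshold   # distinct submitters needed before filtering
        self.submitters = {}         # fingerprint -> set of user ids
        self.spam = set()            # fingerprints currently being filtered

    def submit(self, user_id, body):
        """Record one user's junk submission; activate filtering at the threshold."""
        key = fingerprint(body)
        users = self.submitters.setdefault(key, set())
        users.add(user_id)
        if len(users) >= self.threshold:
            self.spam.add(key)

    def is_spam(self, body):
        """Ignore headers entirely, so a forged sender does not matter."""
        return fingerprint(body) in self.spam
```

An exact hash only catches copies that are identical after normalization; tolerating small edits to the advertisement would need a fuzzier comparison, which this sketch does not attempt.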

The Collaborative Filter is an implementation of server
side filtering. Server side filtering is filtering that occurs at
the Mail Server or Mail Drop Service level. This is a
different approach from client side filtering, in which
filtering occurs at the Mail Reader level. Server side filtering
is appropriate for any organization that manages its own Mail
Servers and Mail Drop Services because it saves network
resources, storage resources, and end user time by stopping
spam from propagating throughout the network.

The Collaborative Filter is composed of two major
components: the Spam Filter and the Submittal Filter. The Spam
Filter is responsible for filtering incoming e-mail while the
Submittal Filter filters user submissions to the Spam Filter.
These two components cooperate to form the Collaborative
filtering process that consists of the following major steps as
shown in FIG. 8.

All incoming messages are first filtered at the Mail Server
using the Spam Filter. If the incoming message is found on
the Spam Filter, it is discarded. All other messages pass
through to the Mail Drop Service.

E-mail is then distributed to the end user mailboxes.

When users download their e-mail from the Mail Drop
Service, it is filtered again via the Spam Filter. Those that
match are discarded before the users ever see them. All other
messages pass safely through to the user. It is necessary to
filter at both the Mail Server and Mail Drop Service level to
ensure that end users are protected by the latest updates to
the Collaborative Filter. However, the system would still
operate if no filtering were to be done at the Mail Drop
Service level.

When end users receive a spam message, they need only
to press a button to submit it to the Collaborative Filter. This
simple action forwards the junk e-mail to a mailbox (e.g.
spam@isp.net) where it is examined by the Submittal Filter
and, if appropriate, is used to update the Spam Filter. If the
message does not already exist in the Spam Filter, then it is
added. If it is already present, then the time of submission is
recorded, and the total number of submissions of that spam
is increased by one. The filter uses this counter to determine
whether to filter against this message. Only messages
received within a certain time frame with a counter greater
than or equal to a predetermined threshold will be used in
filtering. This threshold ensures that a small but statistically
significant number of users have submitted the same spam
message before it is filtered. This prevents inappropriate
filtering due to user errors or improper submissions.

System administrators use the Filter Administration Tool
to define a list of e-mail addresses or domains that cannot be
submitted to the Spam Filter. This list is included in the
database of the Submittal Filter. This prevents end users
from submitting messages from known valid sources such as
system administration broadcast messages or mailing lists.

FIG. 9 illustrates an example of the operation of the
Collaborative Filter. In this simple example, there are two
users (User A and User B), and one spam message sent to both
users. The spam message has never been processed before
by the Collaborative Filter.

Initially, the spam message enters the system. User A logs
on first and checks his e-mail. User A downloads two
messages: his non-spam message and a spam message. Note
that the spam message passes through the filter since it has
never been seen before. User A notes the spam message and
submits it to the Collaborative Filter via a simple button
click. The Collaborative Filter uses User A's submission to
update the filter. From this point on, any new incoming
message that matches the submitted spam e-mail will be
discarded.

User B now logs on and checks her e-mail. User B only
downloads the non-spam message because the Collaborative
Filter filters out the spam message intended for User B based
on User A's collaboration.

Note that this is a simplified example of the filter's
operation. There are only two end users and the Collaborative
Filter begins filtering after one submittal. In actuality, the
number of end users will be far larger and the Collaborative
Filter will only start filtering a spam message after it has
been submitted by a significant number of those users. The
majority of users will be spam free like User B, while a
small percentage of users experience spam and protect the
larger end user community (like User A).

The Collaborative Filter consists of two major
components, namely the Spam Filter and the Submittal
Filter.

The Spam Filter filters incoming e-mail, and the Submittal
Filter filters end users' junk e-mail submissions to the
Spam Filter. End user submissions update the Spam Filter
while system administrators decide what entries reside on
the Submittal Filter. System administrators can add, remove,
and update entries on both the Spam Filter and Submittal
Filter via a Filter Administration Tool.

The Collaborative Filter is a server side filtering solution
as shown in FIG. 10. Server side filtering occurs at the Mail
Server or Mail Drop Service level. This is a different
approach from client side filtering, which occurs at the Mail
Reader level. Server side filtering is appropriate for any
organization that manages its own Mail Servers and Mail
Drop Services because it saves network
resources by stopping spam from propagating throughout
the network. Another advantage is that end users
are not forced to waste time downloading, reading, and
deleting junk e-mail. Note that client side filtering is an
appropriate filtering technology when end users are not
protected from junk e-mail by server side filtering products.
FIG. 10 also illustrates a typical server side filtering
scenario.

It is noted that filtering at the Mail Drop Service ensures
that the most recent updates to the Spam Filter are used to
eliminate junk e-mail that has already been sent to a user's
mailbox, but has not yet been downloaded. This allows the
Collaborative Filter to eliminate spam to most users on its
first mailing.

The Collaborative Filter is implemented via a set of C
libraries. The Collaborative Filter has been implemented in
C due to the following requirements:

1. The Collaborative Filter must be efficient.

2. The Collaborative Filter libraries must be
cross-platform.

3. The libraries must allow for easy integration with
customer software.

A C implementation achieves these goals because:

1. C is a highly efficient development language because it
is compiled. Its long history has allowed for the
development of highly optimized compilers.

2. The C language is highly portable, since most hardware
vendors provide C compilers for their products. In
addition, the ANSI C standard defines a standard
library, which can be reliably used to write portable
code.

3. C is an industry-wide development language, used by
millions of developers around the world. Therefore, it
is extremely likely that any customer can integrate the
Collaborative Filter libraries into his/her software. In
addition, most popular development languages allow
for calling C directly (e.g., C++, Java, Perl, Visual
Basic, etc).

The Spam Filter Library contains the code needed to filter
e-mail. This library contains the spamCheck( ) routine which
is used to check if a given e-mail is junk. The caller passes
a structure to spamCheck( ) which contains pointers to the
headers and body of the message to be checked. Therefore,
if a message's headers and body are already in memory (e.g.,
when a Mail Server is processing an SMTP connection), then
no memory copying will be needed and spamCheck( ) can
efficiently check whether a message is junk.

The spamCheck( ) function checks whether a given e-mail 
is junk in the following fashion: 

1. spamCheck( ) generates a signature for the message. 

2. spamCheck( ) queries the Spam Database for the 
message's signature. 

3. If the database query does not find the message's
signature, then the e-mail is not junk and it can be passed
on to end users.

4. If the database query does find the message's signature,
then a matching function is used to determine whether
or not the message in question truly matches a
message in the Spam Filter database.

a. If the matching function does not find these messages 
to be equivalent, then the message is not junk. 

b. If the matching function does find these messages to 
be equivalent, then the message is filtered. 
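
The four steps above can be sketched as follows. This is a
minimal illustration only, written in Python for brevity even
though the specification describes a C library. The names
Message, SpamDatabase, signature, bodies_match and spam_check
are hypothetical, the choice of SHA-256 over a normalized body
is an assumption (the text does not name a hash), and the
matching function here is a simple placeholder rather than the
fuzzy comparison the text calls for.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Message:
    """Headers and body of one e-mail (hypothetical stand-in for the
    structure that the text says is passed to spamCheck())."""
    headers: dict
    body: str


def _normalize(body: str) -> str:
    # Lowercase and collapse whitespace so trivial edits do not matter.
    return " ".join(body.lower().split())


def signature(body: str) -> str:
    """Assumed signature: a truncated SHA-256 hash of the normalized body."""
    return hashlib.sha256(_normalize(body).encode("utf-8")).hexdigest()[:16]


def bodies_match(candidate: str, stored: str) -> bool:
    """Placeholder matching function: equality after normalization."""
    return _normalize(candidate) == _normalize(stored)


@dataclass
class SpamDatabase:
    """In-memory stand-in for the Spam Database: signature -> spam bodies."""
    entries: dict = field(default_factory=dict)

    def add(self, body: str) -> None:
        self.entries.setdefault(signature(body), []).append(body)

    def lookup(self, sig: str) -> list:
        return self.entries.get(sig, [])


def spam_check(message: Message, db: SpamDatabase) -> bool:
    """Return True if the message should be filtered as junk."""
    sig = signature(message.body)      # step 1: generate the signature
    candidates = db.lookup(sig)        # step 2: query the database
    if not candidates:                 # step 3: no hit, so not junk
        return False
    # Step 4: a signature hit is only confirmed by the matching function.
    return any(bodies_match(message.body, stored) for stored in candidates)
```

In this sketch most legitimate mail returns at step 3 after a
single lookup, which is the efficiency argument the text makes
for signatures; only a signature hit pays for the body
comparison.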

Source filtering is based on a message's sender
information. Most source based filtering techniques use the "From"
address from the message's header. Source based filtering is
not appropriate for completely filtering junk e-mail since
headers are easily forged. To overcome this limitation, the
Collaborative Filter primarily uses the message body for
filtering. This is known as message based filtering.

The signature described in the above process is the output
of a hash function applied to the message's body. Message
signatures are very important because they allow the
Collaborative Filter to operate in an efficient manner. The
message signature enables non-junk e-mail messages to quickly
pass through the filter. This occurs because it is extremely
unlikely that an incoming non-junk message signature will
match the signature of a junk e-mail already stored in the
database.

Since the message signature is produced by a hash function,
there will be some unavoidable signature collisions (i.e., two
unique messages which generate the same signature). The
filtering algorithm resolves signature collisions by
calculating a matching function on both messages to ascertain
if these messages are really equal. The matching function uses
a combination of techniques (e.g., checksum, fuzzy
matching) to generate a likelihood that two messages are
essentially equivalent. Exact comparisons cannot be used
since junk e-mail senders will embed extra characters in
their outgoing e-mails to circumvent message based filtering
techniques. For example, spammers may add extra
characters at the beginning of a message by including
personalized salutations. A fuzzy matching function is the
appropriate solution to this problem because a spammer cannot
change that portion of an e-mail's body that contains his or
her message (e.g., advertisement).
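
One way such a likelihood could be computed is sketched below.
The specification does not disclose the actual matching
function, so this is an assumption: it uses
difflib.SequenceMatcher from the Python standard library to
measure how much of a known spam body reappears, in order, in
an incoming body. The names match_likelihood and is_equivalent
and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def _normalize(body: str) -> str:
    # Lowercase and collapse whitespace before comparing.
    return " ".join(body.lower().split())


def match_likelihood(candidate: str, known_spam: str) -> float:
    """Fraction (0.0 to 1.0) of the known spam body that is covered by
    matching blocks found in the candidate body."""
    a, b = _normalize(candidate), _normalize(known_spam)
    if not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(b)


def is_equivalent(candidate: str, known_spam: str,
                  threshold: float = 0.9) -> bool:
    """Treat two bodies as the same spam above an assumed threshold."""
    return match_likelihood(candidate, known_spam) >= threshold


spam = "Buy cheap widgets now! Visit our store today for unbeatable prices."
personalized = "Dear Alice,\n\n" + spam   # salutation added by the sender
unrelated = "Lunch at noon on Friday?"
```

With these sample strings, is_equivalent(personalized, spam)
holds because the whole advertisement survives behind the
salutation, while is_equivalent(unrelated, spam) does not. The
measure is deliberately one-sided (coverage of the known spam),
and a production matcher would need tuning against long
legitimate messages.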

The Submittal Library contains the code needed to handle
user submissions to the Collaborative Filter. The library
contains the spamSubmit( ) routine which is used to submit
a user's junk e-mail message to the Collaborative Filter.

End users will have a mechanism for forwarding a piece
of junk e-mail to the Collaborative Filter. This mechanism
will forward the junk e-mail message to a defined mailbox
for handling junk e-mail (e.g. spam@isp.net). The Submittal
Library is then used to process these incoming junk e-mail
submissions.

The main job of the Submittal Library is to filter
incoming submittals to ensure that valid messages are not
included in the Spam Filter. Examples of such messages are
system administration broadcast messages and mailing lists.

The spamSubmit( ) routine submits spam to the Spam 
Filter in the following fashion: 

1. The forwarded junk e-mail is parsed from the
submission e-mail sent by an end user.

2. The Submittal Filter is checked to determine whether 
this submitted junk e-mail should be added to the Spam 
Filter. 

3. When a submitted junk e-mail is added to the Spam
Filter, the Spam Filter is updated by spamSubmit( ).
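
The submission steps, together with the submission counter,
threshold and time frame described earlier, might be sketched
as follows. This is an illustration in Python rather than the
C library itself. The class and method names, the default
threshold of 3 and the one-week window are assumptions, and the
sketch assumes the junk e-mail has already been extracted from
the forwarding message as plain single-part RFC 822 text. It
also keys entries by exact body, where a real implementation
would use the signature and matching function.

```python
import time
from dataclasses import dataclass, field
from email import message_from_string
from email.utils import parseaddr
from typing import Optional


@dataclass
class SpamEntry:
    count: int = 0                # total submissions of this spam
    last_submitted: float = 0.0   # time of the most recent submission


@dataclass
class CollaborativeFilter:
    # Submittal Filter: addresses or domains that may not be submitted.
    protected: set = field(default_factory=set)
    threshold: int = 3                        # assumed value
    window_seconds: float = 7 * 24 * 3600.0   # assumed value
    spam: dict = field(default_factory=dict)  # body -> SpamEntry

    def is_protected(self, sender: str) -> bool:
        address = parseaddr(sender)[1].lower()
        domain = address.rpartition("@")[2]
        return address in self.protected or domain in self.protected

    def spam_submit(self, raw_junk: str, now: Optional[float] = None) -> bool:
        """Return True if the submission updated the Spam Filter."""
        now = time.time() if now is None else now
        junk = message_from_string(raw_junk)            # step 1: parse
        if self.is_protected(junk.get("From", "")):     # step 2: Submittal Filter
            return False
        entry = self.spam.setdefault(junk.get_payload(), SpamEntry())
        entry.count += 1                                # step 3: update Spam Filter
        entry.last_submitted = now
        return True

    def is_filtering(self, body: str, now: Optional[float] = None) -> bool:
        """True once enough recent submissions exist for this body."""
        now = time.time() if now is None else now
        entry = self.spam.get(body)
        return (entry is not None
                and entry.count >= self.threshold
                and now - entry.last_submitted <= self.window_seconds)
```

Interpreting "within a certain time frame" as "the most recent
submission is no older than the window" is one possible
reading; the text does not say how the time frame is measured.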

Note that the Submittal Filter is not updated via the
spamSubmit( ) routine. The Submittal Filter is updated by
the Filter Administration Tool, which uses direct SQL
commands to add, remove and update entries on the Submittal
Filter.

The Spam and Submittal Filters are both stored in rela- 
tional databases. Relational database technology was chosen 
due to its mature nature and ability to handle numerous 
transactions. These are extremely important considerations 
since the Collaborative Filter will filter all incoming e-mail. 

The Collaborative Filter only interacts with relational 
databases via SQL commands. This allows the Collaborative 
Filter to be integrated with any SQL compliant database. 
Leveraging existing database technology allows the execu- 
tion of a filtering product that is more cost effective, efficient 
and reliable. In addition, since the filter back-end is imple- 
mented via a database, customers and other third parties can 
access the filter's data for their own specialized needs. 
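
As an illustration of SQL-only access, the sketch below keeps
both filters in a relational database and touches them only
through SQL statements. The schema, table names, column names
and function names are all hypothetical, since the
specification does not publish its schema, and SQLite (through
the Python standard library module sqlite3) merely stands in
for whichever SQL compliant database a customer uses.

```python
import sqlite3

# Hypothetical schema: one table per filter.
SCHEMA = """
CREATE TABLE spam_filter (
    signature      TEXT    NOT NULL,
    body           TEXT    NOT NULL,
    submissions    INTEGER NOT NULL DEFAULT 1,
    last_submitted TEXT    NOT NULL
);
CREATE INDEX idx_spam_signature ON spam_filter (signature);
CREATE TABLE submittal_filter (
    source TEXT PRIMARY KEY    -- protected address or domain
);
"""


def open_filter_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_submission(conn, signature, body, when):
    """Insert a new spam entry or bump the counter of an existing one."""
    cur = conn.execute(
        "UPDATE spam_filter SET submissions = submissions + 1, "
        "last_submitted = ? WHERE signature = ? AND body = ?",
        (when, signature, body))
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO spam_filter (signature, body, last_submitted) "
            "VALUES (?, ?, ?)", (signature, body, when))
    conn.commit()


def active_candidates(conn, signature, threshold):
    """Bodies stored under a signature that have reached the threshold."""
    rows = conn.execute(
        "SELECT body FROM spam_filter "
        "WHERE signature = ? AND submissions >= ?",
        (signature, threshold)).fetchall()
    return [row[0] for row in rows]
```

Because the lookup is a plain indexed SELECT, the same
statements could be issued against a read-only replica, while
record_submission would be directed at the master database.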

By using off the shelf relational database products, the 
Collaborative Filter can utilize those products' replication 
technology to scale the Collaborative Filter across the enter- 
prise as shown in FIG. 11. The Collaborative Filter's repli- 
cation design is based on one master database and numerous 
read-only replicated sites. This simple replication strategy 
contains no update conflicts and is available from various 
database vendors (e.g., Sybase's Replication Server, Oracle 
Snapshots). 

Under this scheme, network bandwidth allocation is 
flexible, since the schedule for database replication can be 
modified to meet other operational needs. Administrators 
have control over how much network bandwidth they are 
willing to spend for more effective junk e-mail filtering (i.e., 
the more synchronized the databases, the more effective the 
filter). 

The Filter Administration Tool allows system
administrators to administer the Spam Filter and Submittal
Filter. This tool is the main mechanism for adding, removing
and updating entries on the Submittal Filter.

The Filter Administration Tool consists of a web based
front end making Hyper Text Transfer Protocol (HTTP)
requests to a Java Servlet, which interacts with the Spam
Filter and Submittal Filter databases through Java Database
Connectivity (JDBC). This design has the following
advantages:

1. Administrators can use any web browser to administer
the Collaborative Filter.

2. This Java solution allows the Filter Administration Tool 
to be both platform and browser independent. 

3. The performance requirements of the application are 
well within the performance parameters of Java. 

The Collaborative Filter is designed to be tightly
integrated with a customer's existing Mail Servers, Mail Drop
Services, and relational databases. Due to the range of
customer requirements, one should anticipate that some
amount of custom coding would likely be required to
integrate the Collaborative Filter into a customer's
operations.

Mail Server and Mail Drop Service integration is the
most involved step in integrating the Collaborative Filter
since source code modifications must be made to a
customer's Mail Server and Mail Drop Service. The Mail Server
and Mail Drop Service must be modified to call the
spamCheck( ) routine at appropriate places.

However, even if customers do not have access to their 
Mail Server or Mail Drop Service source code, the Collabo- 
rative Filter can still be integrated in a proxy application. For 
example, a proxy SMTP application can be built which sits 
on top of the Mail Server and makes calls to the Spam Filter 
library. E-mail that passes the filter in the proxy application 
is then forwarded to the Mail Server by the proxy. Note that 
the same can be done for POP3 and IMAP4 Mail Drop 
Services. 
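
The decision made by such a proxy can be sketched as below.
This is not a working SMTP proxy: the network side is left
out, and the function name proxy_deliver and its two callback
parameters are hypothetical. The is_spam callback stands in
for a call into the Spam Filter library, and forward stands in
for whatever hands the message on to the real Mail Server. The
sketch assumes single-part messages.

```python
from email import message_from_string
from typing import Callable


def proxy_deliver(raw_message: str,
                  is_spam: Callable[[dict, str], bool],
                  forward: Callable[[str], None]) -> bool:
    """Filter one message in front of the Mail Server.

    Returns True if the message was forwarded, False if discarded.
    """
    parsed = message_from_string(raw_message)
    headers = dict(parsed.items())   # header name -> value
    body = parsed.get_payload()      # a str for single-part messages
    if is_spam(headers, body):
        return False                 # discard: never reaches the Mail Server
    forward(raw_message)             # pass the untouched message along
    return True
```

In a deployment, forward might wrap a standard SMTP client
call to the existing Mail Server, so the server itself needs
no source code changes.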

A relational database system must be allocated to hold the
Collaborative Filter. The system must be able to handle the
extra load generated by queries from the Spam Filter
library. For large organizations, it is recommended to use a
dedicated database system for the Collaborative Filter.
Customers should note that the Collaborative Filter has an
unusual query load compared to most on-line transaction
processing systems, since over 90% of its requests will be
pure queries (i.e., there will be very few inserts, updates or
deletes). Due to this unique query load, a dedicated database
system that can be optimized for filtering is recommended.

Once a database system has been allocated, installation
consists of running several SQL scripts to install the
Collaborative Filter. When the database has been installed, the
customer will have to perform standard database
administration tasks (e.g., backup).

Once the Spam Filter and Submittal Filter databases have
been set up, the Filter Administration Tool's Java Servlet
must be installed on a Web server that has access to these
databases. System administrators will then connect to this
Web server to administer the Collaborative Filter via their
Web browser.

The present invention has been explained with respect to
specific arrangements and methods. However, it is noted that
these arrangements and methods are merely illustrative of
the principles of the present invention. Numerous
modifications in form and detail may be made by those of
ordinary skill in the art without departing from the scope of
the present invention. Although this invention has been shown
in relation to a particular preferred embodiment, it should
not be considered to be so limited.

What is claimed: 

1. A method for mail server side filtering electronic mail 
received over a communication medium comprising the 
steps of: 

providing a first filter at the mail server including a list of 
spam messages which should not be sent to an end user; 

receiving a first electronic message ultimately intended 
for one or more end users at the mail server;

comparing at the said mail server said received first
electronic message to said spam messages provided in 
said first filter, said received first electronic message 
discarded if said received first electronic message is 
included as a spam message in said first filter and 
transmitting said first electronic message to the end 
user if said received first electronic message is not 
included as a spam message in said first filter; 

providing a second filter at the mail server for the receipt 
of a second electronic message sent over the commu- 
nications medium from an end user, said second elec- 
tronic message received at said second filter considered 
to be a spam message by the end user; 

providing a counter associated with said second filter; 

counting the number of repeated second electronic mes- 
sages received by said second filter; 

adding said second electronic message to said first filter as 
a spam message, if said counter exceeds a predeter- 
mined value; and 

adding said second electronic message to said first filter as 
a spam message, if said second filter determines that 
said second electronic message is a spam message. 

2. A system for filtering electronic mail received over a 
communication medium to an end user's computer, com- 
prising: 

a mail server for receiving first electronic messages 
intended to be received by the end user, said mail server 
provided at a location remote from the end user; 

a first filter located at said mail server, said filter including 
a list of spam messages which should not be sent to the 
end user, said first filter also including a device for 
comparing at least a portion of the body of said first 
electronic messages with each of said spam messages; 

a second filter located at said mail server for receiving 
second electronic messages transmitted over the com- 
munication medium to said mail server from the end 
users; 

a counter and comparison device located at said mail 
server and in communication with said second filter for 
counting and classifying the number of said second 
electronic messages received by said second filter, said 
counting and comparison device determining the num- 
ber of similar second electronic messages received by 
said second filter; 

wherein when the number of similar second electronic 
messages received by said second filter exceeds a 
predetermined number, said similar second electronic 
message is added to said first filter as an additional 
spam message.